Zero-Shot Object Detection

  • ZSL based on bounding box features

    • [1]: makes the detector background-aware by assigning background bounding boxes to background classes, either statically or via latent assignment over a large vocabulary
    • [3]: pairs a max-margin classification loss with a semantic clustering loss that keeps similar classes close in the embedding space (sketched below)
  • End-to-end zero-shot object detection

    • [2]: extends YOLO; concatenates three feature maps to predict the objectness confidence score (sketched below)
    • [4]: uses a polarity loss, a focal-loss variant, together with an external vocabulary to enhance the word vectors (sketched below)
    • [5]: outputs both classification scores and semantic embeddings and combines the two for region scoring (sketched below)
  • Feature generation

    • [6]: synthesizes visual features for unseen classes and uses them to train the detector's classifier (sketched below)

    • [7]: semantics-preserving graph propagation modules that enhance both category and region representations (sketched below)
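
The clustering idea in [3] can be made concrete with a toy objective. A minimal sketch, assuming region features already projected into word-vector space; the margin value, the squared-distance cluster term, and the weight `lam` are illustrative choices, not the paper's exact formulation:

```python
import torch
import torch.nn.functional as F

def max_margin_cluster_loss(proj, target, class_emb, margin=0.2, lam=0.5):
    # proj: (N, D) region features projected into word-vector space
    # target: (N,) ground-truth class indices
    # class_emb: (C, D) class word vectors
    proj_n = F.normalize(proj, dim=1)
    emb_n = F.normalize(class_emb, dim=1)
    scores = proj_n @ emb_n.t()                       # cosine scores, (N, C)
    true = scores.gather(1, target[:, None])          # score of the true class
    # max-margin term: every wrong class should trail the true class by `margin`
    mm = F.relu(margin + scores - true)
    mm = mm.scatter(1, target[:, None], 0.0)          # ignore the true class
    # clustering term: pull each projection toward its own class word vector
    cluster = (proj_n - emb_n[target]).pow(2).sum(dim=1)
    return mm.sum(dim=1).mean() + lam * cluster.mean()
```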
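
For [2], the multi-map objectness idea can be sketched as a small head over three feature maps taken from different network depths. Channel counts, the bilinear resizing, and the 1x1 prediction conv are assumptions for illustration, not the paper's exact architecture:

```python
import torch
import torch.nn as nn
import torch.nn.functional as F

class ObjectnessHead(nn.Module):
    def __init__(self, c1, c2, c3):
        super().__init__()
        # 1x1 conv over the concatenated maps -> one confidence channel
        self.conv = nn.Conv2d(c1 + c2 + c3, 1, kernel_size=1)

    def forward(self, f1, f2, f3):
        h, w = f1.shape[-2:]
        # resize the other maps to f1's spatial size, then concatenate
        f2 = F.interpolate(f2, size=(h, w), mode="bilinear", align_corners=False)
        f3 = F.interpolate(f3, size=(h, w), mode="bilinear", align_corners=False)
        x = torch.cat([f1, f2, f3], dim=1)
        return torch.sigmoid(self.conv(x))   # (N, 1, H, W) objectness map
```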
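
The polarity loss of [4] modulates a focal-style term so the penalty grows when a negative class scores close to or above the true class. A hedged sketch; the sigmoid penalty with slope `beta` and the handling of the positive entry are illustrative assumptions, not the paper's exact function:

```python
import torch
import torch.nn.functional as F

def polarity_loss(logits, target, gamma=2.0, beta=5.0):
    # logits: (N, C) raw class scores; target: (N,) ground-truth indices
    p = torch.sigmoid(logits)
    onehot = F.one_hot(target, logits.size(1)).float()
    # standard focal-loss modulation
    pt = p * onehot + (1 - p) * (1 - onehot)
    focal = -((1 - pt) ** gamma) * torch.log(pt.clamp_min(1e-8))
    # polarity penalty: compare every class prob to the true class's prob;
    # the penalty is high when a negative class rivals the true class
    p_true = (p * onehot).sum(dim=1, keepdim=True)    # (N, 1)
    penalty = torch.sigmoid(beta * (p - p_true))
    penalty = penalty * (1 - onehot) + onehot         # no penalty on the true class
    return (focal * penalty).sum(dim=1).mean()
```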
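
For [5], a minimal two-branch region head: one branch scores seen classes directly, the other maps the region feature into word-vector space so any class with a word vector (including unseen ones) can be scored by similarity. The blending weight `alpha` and layer shapes are assumptions for illustration:

```python
import torch
import torch.nn as nn
import torch.nn.functional as F

class HybridRegionHead(nn.Module):
    def __init__(self, feat_dim, embed_dim, class_embeddings):
        super().__init__()
        # class_embeddings: (C, D) class word vectors
        self.register_buffer("class_emb", F.normalize(class_embeddings, dim=1))
        self.cls_branch = nn.Linear(feat_dim, class_embeddings.size(0))  # direct scores
        self.emb_branch = nn.Linear(feat_dim, embed_dim)                 # semantic embedding

    def forward(self, region_feats, alpha=0.5):
        direct = self.cls_branch(region_feats)                   # (N, C)
        emb = F.normalize(self.emb_branch(region_feats), dim=1)  # (N, D)
        semantic = emb @ self.class_emb.t()                      # cosine scores, (N, C)
        return alpha * direct + (1 - alpha) * semantic           # hybrid score
```

At test time the semantic branch alone can score unseen classes by swapping in their word vectors.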
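
The feature-generation route of [6] can be sketched as a conditional generator that maps a class word vector plus noise to a fake region feature; synthesized unseen-class features then supervise an ordinary classifier. The MLP shape and dimensions are illustrative, and the paper's adversarial and regularization losses are omitted:

```python
import torch
import torch.nn as nn

class FeatureGenerator(nn.Module):
    def __init__(self, embed_dim, noise_dim, feat_dim):
        super().__init__()
        self.net = nn.Sequential(
            nn.Linear(embed_dim + noise_dim, 1024),
            nn.LeakyReLU(0.2),
            nn.Linear(1024, feat_dim),
            nn.ReLU(),  # region features are typically post-ReLU activations
        )

    def forward(self, class_emb, noise):
        return self.net(torch.cat([class_emb, noise], dim=1))

# Usage: synthesize features for unseen classes, then fit a classifier on them.
gen = FeatureGenerator(embed_dim=300, noise_dim=300, feat_dim=2048)
unseen_emb = torch.randn(8, 300)      # stand-in for unseen-class word vectors
noise = torch.randn(8, 300)
fake_feats = gen(unseen_emb, noise)   # (8, 2048) synthesized region features
```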
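
Finally, the propagation in [7] can be approximated by one GCN-style update over a class graph built from word-vector similarity, so category (and, analogously, region) representations absorb information from related classes. The thresholded adjacency, row normalization, and single-layer update are simplifications of the paper's semantics-preserving modules:

```python
import torch
import torch.nn as nn
import torch.nn.functional as F

def propagate(class_emb, weight):
    # class_emb: (C, D) class word vectors; weight: learnable (D, D) projection
    emb_n = F.normalize(class_emb, dim=1)
    adj = F.relu(emb_n @ emb_n.t())              # keep positively related classes
    adj = adj / adj.sum(dim=1, keepdim=True)     # row-normalize (diagonal keeps mass)
    return F.relu(adj @ class_emb @ weight)      # one propagation step

C, D = 20, 300
emb = torch.randn(C, D)
W = nn.Parameter(torch.randn(D, D) * 0.01)
enhanced = propagate(emb, W)                     # (C, D) enhanced category reps
```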

References

[1] Ankan Bansal, Karan Sikka, Gaurav Sharma, Rama Chellappa, and Ajay Divakaran, “Zero-Shot Object Detection”, ECCV, 2018.

[2] Pengkai Zhu, Hanxiao Wang, and Venkatesh Saligrama, “Zero Shot Detection”, IEEE Transactions on Circuits and Systems for Video Technology (T-CSVT), 2019.

[3] Shafin Rahman, Salman Khan, and Fatih Porikli, “Zero-Shot Object Detection: Learning to Simultaneously Recognize and Localize Novel Concepts”, arXiv:1803.06049, 2018.

[4] Shafin Rahman, Salman Khan, and Nick Barnes, “Polarity Loss for Zero-Shot Object Detection”, arXiv:1811.08982, 2018.

[5] Berkan Demirel, Ramazan Gokberk Cinbis, and Nazli Ikizler-Cinbis, “Zero-Shot Object Detection by Hybrid Region Embedding”, arXiv:1805.06157, 2018.

[6] Nasir Hayat et al., “Synthesizing the Unseen for Zero-Shot Object Detection”, ACCV, 2020.

[7] Caixia Yan et al., “Semantics-Preserving Graph Propagation for Zero-Shot Object Detection”, IEEE Transactions on Image Processing, vol. 29, pp. 8163-8176, 2020.